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The detection and estimation of gravitational wave (GW) signals belonging to a parameterized 
family of waveforms requires, in general, the numerical maximization of a data-dependent func- 
tion of the signal parameters. Due to noise in the data, the function to be maximized is often 
highly multi-modal with numerous local maxima. Searching for the global maximum then becomes 
computationally expensive, which in turn can limit the scientific scope of the search. Stochastic op- 
timization is one possible approach to reducing computational costs in such applications. We report 
results from a first investigation of the Particle Swarm Optimization (PSO) method in this context. 
The method is applied to a testbed motivated by the problem of detection and estimation of a 
binary inspiral signal. Our results show that PSO works well in the presence of high multi-modality, 
making it a viable candidate method for further applications in GW data analysis. 
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I. INTRODUCTION 

The detection and estimation of a gravitational wave 
(GW) signal belonging to a parameterized family of 
waveforms requires, in general, the numerical maximiza- 
tion of some data-dependent function over the space of 
the signal parameters. For example, in the matched fil- 
tering P, [4I method, which is the focus of this paper, 
the function to be maximized is a suitably defined inner 
product between the data and parameterized signal wave- 
forms. The global maximum of this function serves as a 
detection statistic. A point estimate of the signal param- 
eters is furnished by the location of the global maximum 
in parameter space. 

The presence of noise in the output of GW detectors 
leads to a large number of local maxima in this func- 
tion that are distributed randomly in parameter space. 
The search for the global maximum in this forest of lo- 
cal maxima then becomes a computationally expensive 
task. This can affect the sensitivity of a search by limit- 
ing either the volume that is searched in parameter space 
or the integration length of data required for accumulat- 
ing sufficient signal-to- noise ratio (SNR), or both. The 
computational efficiency of the search for the global max- 
imum is, thus, an important issue in GW data analysis. 
The various search strategies proposed in the GW liter- 
ature so far can be broadly divided into those based on 
sampling the function on predetermined grids of points 
in parameter space and those that use stochastic 

optimization methods (e.g., [sl-fioj). 
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In the class of grid-based methods, significant savings 
in computational costs have been demonstrated with a 
hierarchy of grids 0, H, 0] • A nice feature of grid-based 
methods is that they are easy to characterize statistically 
and, hence, design variables of the algorithm, such as the 
spacing of points, can be fixed systematically. 

Stochastic methods do not use pre-determined grids 
but employ some form of random walk through the pa- 
rameter space. The probabilistic rules of the random 
walk are tuned to maximize the chances of its terminating 
close to the global maximum. There are many algorithms 
that fall under the class of stochastic methods, a hybrid 
of simulated annealing and Metropolis-Hastings Markov 
Chain Monte Carlo (MCMC) being the most widely ex- 
plored in GW data analysis |8l-[l0|. 

Since the number of points in a grid grows exponen- 
tially with the dimensionality of the parameter space, 
stochastic methods tend to outperform grid-based ones 
with an increase in the number of signal parameters. 
It is worth noting here that stochastic methods in GW 
data analysis incur the additional computational cost of 
generating signal waveforms on the fly. In grid-based 
methods, on the other hand, waveforms can be computed 
and stored in advance of processing the data. Stochas- 
tic methods can, therefore, lose their advantage if the 
computational cost of generating waveforms becomes too 
high. 

The performance of a stochastic method may be sensi- 
tive to the values to which its design variables are tuned. 
Since the tuning is usually done on simulated data, it 
is not clear how robust current stochastic methods are 
against features of real data such as non-stationarity 
and non-Gaussianity. Additionally, the number of de- 
sign variables that require careful tuning is fairly large 
for some of the methods. In such cases, tuning becomes 
more of an art than a well defined procedure and this may 
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also affect robustness. In some methods, prior informa- 
tion is used about generic features of the function to be 
maximized. This may not be rehable if the assumptions 
behind the prior information, such as a particular noise 
model, become invalid. To properly address issues such 
as these it is important that a wide variety of stochastic 
methods be explored in GW data analysis . 

Particle Swarm Optimization (PSO) [llj . first pro- 
posed by Kennedy and Eberhardt in 1995, is a stochas- 
tic method that has been garnerin g a lot of attention 
recently in many application areas [12|. An attractive 
feature of PSO is that, in its basic form, it has a small 
number of design variables. On standard testbeds, PSO 
has been found to have comparable or superior perfor- 
mance to other well known methods such as MCMC. 

This paper presents the first application of PSO to GW 
data analysis. We pose the following specific questions: 

1. Is PSO a viable method when applied to a function 
that is highly multi-modal and essentially stochas- 
tic in nature? This is the typical case in GW data 
analysis. 

2. How many design variables are there in PSO, and 
how many of them need to be tuned well? 

3. Can the tuning of these variables be done with- 
out requiring prior information about features of 
the function, thus increasing the robustness of the 
method? 

4. What is the computational cost of the method and 
what are the most important technical improve- 
ments required for the future? 

To answer these questions in the most direct and reliable 
manner, we construct a testbed based on the well un- 
derstood task of detecting and estimating binary inspiral 
signals in data from a single ground-based detector. This 
problem involves low-dimensionality but offers the more 
serious challenge of high multi-modality. To keep the fo- 
cus on the latter, a simplification is made regarding the 
shape of the search region such that it admits unphysical 
waveforms. Thus, the implementation of PSO presented 
here is not directly applicable to binary inspiral searches 
at present. The required technical refinements are dis- 
cussed in the paper. In addition, a novel and system- 
atic tuning procedure is introduced that is based on data 
containing only noise. This procedure may be useful for 
other stochastic methods also. 

The rest of the paper is organized as follows. Sec. [ll] 
describes the testbed and Sec. IIIII describes the PSO 
method. We explain our procedure for tuning the de- 
sign variables of PSO in Sec. IIVI Sec. |V] then presents 
results from numerical simulations. Our conclusions and 
pointers to future work are presented in Sec IVIl 



samples. The k^^ sample, < fc < A'^ — 1, is denoted 
by x[k]. 

Ss, T: The sampling interval and the duration of x re- 
spectively. The number of samples in x is iV = 
[T/(5s], where the square brackets denote trunca- 
tion to the nearest integer. 

O : The set of parameters describing a family of signals. 

s(8): The time series of the signal corresponding to 
parameter values O. The fc"^ sample of s(8) is 
denoted by s[k;Q]. In our case, signals have a 
well-defined start and stop time and the interval 
between them may be less than T. However, s(8) 
still consists of N samples with the samples outside 
the interval enclosed by the start and stop times set 
to (zero-padding). 

x: The Discrete Fourier Transform (DFT) of x. 
The DFT value at the frequency k/T, k = 
0, 1, ... , [N/2 + 1], is denoted as x[k]. The DFT 
of s(0) is denoted by s^(O) and its value at the k*"^ 
frequency by s'[fc; &]. 

(x, y) : The time series introduced above are elements 
of M^, the vector space of real A^-tuples. From 
the point of view of detection and estimation of 
a signal in data with additive stationary noise, a 
natural inner product can be introduced on this 
vector space, 



(x,y) 




(1) 



where Sn [k] is the one-sided Power Spectral Density 
(PSD) of the noise. 



|y|j: the norm on M^, 

l|yf = (y,y) 



(2) 



induced by the inner product defined above. The 
signal to noise ratio (SNR) of a signal s(0) is de- 
fined as ||s(e)||. 



II. TEST BED 

In this section, we describe the testbed to which PSO 
is applied. The testbed is constituted by the noise model, 
signal family and the function to be maximized. 

A. Noise model 



Notation 



x, y, etc: A time series with a finite number, N, of 



A GW signal incident on an interferometric ground- 
based detector produces a difference in the lengths of its 
two arms. After calibrating out the common arm length 
and the transfer function of the detector, the data, x. 



3 



contains the measured GW-induced strain added to in- 
strumental and environmental noise n. Thus, x = n 
when no GW signal is present, and x = s(8) + n when 
there is. In our simulations, n is a realization of a sta- 
tionary, Gaussian noise process with a PSD, S'„[fc], that 
matches the initial LIGO design sensitivity curve in 
shape [III- 

B. Signal waveforms 

We use the signal family associated with a non- 
spinning inspiraling binary system, cornputed up to the 
second post-Newtonian (2PN) order [l5|. This sys- 
tem consists of two non-spinning compact stars (Neu- 
tron Stars or Black Holes) losing orbital binding energy 
through GW emission. Members of this signal family 
have chirp waveforms with monotonically increasing in- 
stantaneous amplitude and frequency. 

For the case of a single detector, the parameters speci- 
fying the 2PN signal waveforms can be grouped into two 
sets. The first set is that of the chirp-time 16] param- 
eters, {Ta}, a = 0,1,1.5,2, that are constructed out of 



where, the lower cutoff frequency fa was explained above 
and the upper cutoff frequency fc follows from the termi- 
nation of the inspiral waveform when the binary reaches 
its last stable orbit. For our testbed, we set fc = 700 Hz 
although in a real search it depends on the mass of the 
binary system. The expression for ipif] ^) is given in Ap- 
pendix [Xj The normalization constant Af is defined such 
that, \\s{{A = l,<^a,ta,d}W = 1- It follows that A is 
the SNR of the signal. 

Later on, we use the fact that although not all values 
of 9 correspond to valid binary mass components, Eq. [3] 
can still be used to generate perfectly normal waveforms. 
These waveforms are also chirps but their phase evolution 
does not correspond to any physical binary system. 



C. Fitness function 

The function to be maximized is, 

A(i„0|x) = [(qo(ta,0),x)2 + 

(q^/2(t„,0),x)2]^/' , (4) 

q4ta,e) = s{{A=l,^a^(l),ta,0}) . (5) 



the masses of the two components of the binary. Expres- 
sions for the chirp-time parameters are provided in Ap- 
pendix El The second set consists of the time- of- arrival, 
ta, the initial phase, $a and the amplitude, A. Interfer- 
ometric ground-based detectors have a sharp rise in seis- 
mic noise below some frequency /q ( = 40 Hz for inital 
LIGO). The chirp signal from a binary inspiral is essen- 
tially unobservable when its instantaneous frequency is 
below fa- The time at which the signal becomes visible 
is ta and the corresponding instantaneous phase of the 
signal is 

Since all the four chirp-times depend on the masses of 
the two compact stars, only two of them are independent. 
We choose tq and ti 5 as the two independent chirp-time 
parameters. Thus, the set of signal parameters is = 
{A, <S>a,ta,e = {ro,ri.5}}. 

As discussed in Sec. HI the computational cost of gen- 
erating waveforms on the fly is important for stochastic 
methods like PSO. The 2PN signal family is amenable 
to a fast implementation because a sufficiently accurate 
analytical form exists for the Fourier transform of these 
waveforms [l7| . 



(3) 

I 

This function is obtained by maximizing the log- 
likelihood, (x,s(e)) - (l/2)||s(e)|p analytically over A 
and $a. 

For given 6, the evaluation of {c[^{ta,0),x) over ta = 
mSs, m = 0,1, . . . , N — 1, is a. cross-correlation operation 
that can be computed efficiently using the Fast Fourier 
Transform (FFT). Thus, the function that is maximized 
using PSO is, 

X{e\x) = ma.xA{ta,e\x) . (6) 

In the remainder of the paper, A(6'|x) will be called the 
fitness function in keeping with the standard terminology 
used in much of the literature on stochastic methods. 

The presence of noise in x makes the fitness function 
highly multi-modal as shown in Fig [T] The large number 
of local maxima with random locations and sizes poses a 
strong challenge to stochastic methods. When the noise 
is stationary and Gaussian, and the signal present in the 
data is from the waveform family that one is searching 
for, certain characteristic features are present in the fit- 
ness function. For example, the shape of the peak in 
Fig. [1] is elongated on the average along a predictable di- 
rection. MCMC methods in the GW literature use this 
type of prior information about the fitness function in 
tuning the design variables @. 



Jik; 6] 




k < [faT] , 

exp [-27rifta + i^a - ^i'if■, 0) + if] , [faT] <k< [fcT] 

k > [fcT] 



4 



6.5 ~ 




FIG. 1: A realization of the fitness function for the binary in- 
spiral testbed. The data contains a signal with an SNR=8.0. 
In the absence of noise, the fitness function has only one ex- 
tremum at the location identified by the chirp-times of the 
signal. The presence of noise leads to a forest of local max- 
ima. 

III. PARTICLE SWARM OPTIMIZATION 

The PSO algorithm is first described in terms of a 
general fitness function X{9), over some parameter set 
9. Later, we specialize the discussion to the case of the 
binary inspiral testbed. 

A. The PSO Algorithm 

Let 9 = {9i, 02, • • • , 9d} denote a point in M^, and X{9) 
be the fitness function. The essential idea behind PSO is 
to compute X{9) simultaneously at several locations and 
use these samples to influence the locations for computing 
the next set of samples. This process continues iteratively 
until some stopping rule is satisfied. The process can be 
visualized by treating the sample locations as a swarm 
of particles that moves in R^, hence the name of the 
algorithm. A precise description now follows. 

Let the coordinates in M.^ of the i"^ particle in a swarm 
of Np particles be Qi[k] at the k*"^ step in the search 
(fc = 0, 1, . . .). Associated with this particle is a velocity 
vector Vi [k] that determines Qi [fc -I- 1] , 

Q,[fc + 1] = Q.,[k]+V^[k] . (7) 

The PSO algorithm is usually started with randomly cho- 
sen particle locations and velocities. In our implementa- 
tion, we position the particles initially on a regular grid 
while the initial velocities are kept random. 

Let the maximum value of X{9) found by the i*^ parti- 
cle over k steps be Ri{k) and the location of Ri{k), called 
the particle's best location pbest, be Pi[k]. Thus, 

R^{k) ^ X{P,[k]) > HQ^[J])■, 3<k- (8) 



Let the maximum over {Ri{k)}, i = 1, . . . , Np, be Rg{k) 
and its location, called the global best location ghest be 

R,{k)^X{Pg[k])>X{P,[k])- yj. (9) 

At any step, there is always one particle in the swarm 
whose pbest is also the ghest. We call this particle the 
best particle at step fc. Note that both pbest and gbest 
are locations found over the entire past history of the 
motion of the particles. They need not necessarily change 
at every step. 

The velocity for the i^^ particle at the next step, fc-l- 1, 
is determined by the dynamical equation, 

F,[fc + 1] = wV^[k] + CiXliPr[k] - Qr[k]) + 

C2X2{Pam - Q^[k]) , (10) 

where w, which can depend on fc, is called the inertia 
weight, ci and C2 are called acceleration constants, and 
Xi, X2 are random numbers drawn independently at each 
step from the uniform distribution on [0, 1]. 

Finally, for any component Vi_m[p] of the particle ve- 
locity V,[p] - {V,,i[p],...,V^,d[p]), 

Tr r 1 _ / ^nax,m 5 ^,m[p] ^nax,m /'I 1 ^ 

where Kiax,fc > 0, fc = 1,...,D, and Knax = 
^d) is called the maximum velocity. 

Like all stochastic methods, PSO involves a compe- 
tition between wide ranging exploration of the fitness 
function and convergence to a best value. In order to 
avoid trapping by a local maximum, the method must be 
able to explore other parts of the parameter space, while 
to find the global maximum, the method must eventu- 
ally explore a progressively smaller region around some 
point. The way this competition is implemented in the 
PSO algorithm is seen clearly from Eqs. [TUl The first 
term simply moves a particle along a straight line, while 
the remaining two terms are sources of acceleration, one 
pulling it towards its pbest and another pulling it towards 
gbest. The last two effects are combined with random 
weights xi and X2- The random defiections and inertial 
motion allow a particle to explore the fitness function, 
while the attractive pulls of pbest and gbest counter this 
behavior. With a dynamic inertia weight that decreases 
in time, the attractive pull eventually wins over. A rudi- 
mentary emulation of real biological swarming behavior 
is built in through each particle being aware of gbest. 

The PSO algorithm has another interesting feature. 
The best particle, by definition, has its pbest Pi[k] coin- 
cident with gbest, Pg[k], making the terms Pi[k] — Qi[k] 
and Pg[k] — Qi[k] equal for it. This particle then acceler- 
ates towards gbest alone and only moves along a straight 
line through this location. This situation continues un- 
til a new gbest is found. In effect, one particle at any 
step shows a convergence behavior, exploring the neigh- 
borhood of the current ghest, while the other particles 
continue their exploration. 
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Termination criterion 



C. Search Boundary 



For stochastic methods, the probabihty of convergence 
to the global maximum is usually guaranteed only in the 
asymptotic limit. Hence, any practical implementation of 
a stochastic method must include a criterion for termi- 
nating the search. The criterion we adopt for termination 
is specific to the fitness function for the binary inspiral 
testbed and, accordingly, 6 now refers to the chirp time 
parameters. 

If the particles in PSO continue to move over several 
steps but do not find a significantly different gbest, it 
is likely that the current gbest lies close to the global 
maximum. A natural criterion for termination then is 
to check if gbest stays confined to a small region over a 
predetermined number of steps. 

When the data contains only a signal, x = 
s({^, ^a,ta,9}), the fitness function is maximum at the 
location 9 of the signal. ( s{9) = s{{A,^a,ta,0}) for 
brevity in the following since the other parameters do 
not figure in the fitness function.) The fractional drop 
in the fitness function for a small displacement A9 = 
{A9i = Atq, A02 = Ati.s) is given by, 



I - 



x{9 + Ae\s{e)) 

mm) 



T-Lij — 



-Ei,=in..,Ae.A9, 
2\{d\s{e)) 

d^\{9'\s{9)) 



d9'd9', 



(12) 



(13) 



where 9 is the location of the signal. For a small frac- 
tional drop a, therefore, we get an ellipsoidal region Sa{9) 
centered at 9 such that X{9'\s{9)) > (1 - a)X{9\s{9)) if 
9' e 5^(0). 

Now, the neighborhood of the global maximum in the 
presence of noise is also Sa (9) on the average for a frac- 
tional drop a. Therefore, it is natural to choose the re- 
gion of convergence to be Sa (9) in general. This reduces 
the task of specifying the region to simply choosing a 
value for a. Following a convention widely used in the 
GW literature ^1, we fix a = 0.03. 

Thus, we arrive at the following criterion for terminat- 
ing PSO. At each step k, (i) obtain the ellipsoid around 
gbest, that is, Sa{Pg[k]). (ii) If the best location Pg[k + 1] 
falls outside SaiPg[k]), then reset the region of conver- 
gence to the new best location, i.e., use SQ{Pg[k + 1]). 
(iii) If the region of convergence is not found to change 
over Nt successive steps, then terminate PSO. 

The termination criterion implies that if PSO termi- 
nates near the true global maximum, the fitness value 
found will have a fractional drop less than a. Conse- 
quently, it will have a performance comparable to a grid- 
based search in which the templates are spaced according 
to the minimal match criterion [5] and the minimal match 
is 1 — a. This is important for situations where a grid- 
based search is infeasible as it guarantees that PSO will 
perform as well or better. The probability of convergence 
to the global maximum must be high, however, and this 
is the objective of the tuning process described later. 



Even with the termination criterion in place, the search 
region must be finite in order for PSO to terminate in a 
finite number of steps. Otherwise, the swarm may con- 
tinue to find a better gbest and the termination criterion 
may never be satisfied. This is especially relevant in the 
case when the data has only noise. Thus, the PSO dy- 
namics must be supplemented with appropriate bound- 
ary conditions. Many approaches to this problem have 
been proposed, with a good summary provided in ^isj . In 
this paper, we use the invisible wall boundary condition, 
but we also briefiy describe some of the others below. 



1. Types of boundary conditions 

The boundary conditions proposed in the PSO litera- 
ture are as follows. (This list is taken from 18] and is by 
no means an exhaustive one.) 

Absorbing walls - When a particle crosses a rectangular 
boundary, the velocity component perpendicular to 
the boundary is zeroed. Eventually, this allows the 
particle to be pulled into the search domain. 

Reflecting walls - As with the absorbing walls condi- 
tion, the particle velocity is altered but instead of 
being zeroed, the velocity component perpendicu- 
lar to the wall is reversed in sign. This throws the 
particle back into the search domain. 

Invisible walls - No change is made to the dynamics of 
the boundary crossing particle but X{9\x) is set to 
zero and it is not evaluated as long as the particle 
stays outside the boundary. 

We have tried all three boundary conditions but, like the 
authors of 18], we find that the invisible wall condition 
tends to perform better than the other two. 



2. Search region for the testbed 

The simplest search region in 9 parameter space is a 
rectangle To,mm < tq < To,max and ri.5,„„ < ri.5 < 
Ti.5^max- A part of this region, however, admits wave- 
forms that do not correspond to a physically valid bi- 
nary system. This is due to the dependence of tq, T1.5 on 
the symmetric combinations of binary component masses 
M and /i, the total and reduced mass of the binary re- 
spectively, and the inequality M > 4/i. Nonetheless, as 
remarked in Sec. IIIBl there is no technical problem in 
generating waveforms corresponding to the unphysical 
chirp times and nothing strange happens to the fitness 
function there. See Fig. [TJ for example, where a part of 
the parameter region shown is unphysical. Since the pri- 
mary utility of the binary inspiral problem in this paper 
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is to provide a testbed for PSO, this physical constraint 
is ignored. 

The rectangular search region allows the coordinate 
transformation, 

Xl = (tq — To^rnin)/iTo,max — To^min) , (14) 
X2 = {ti.5 — Ti,5^rnin)/{Tl.5.max — Ti,5^min) , (15) 

such that Xi e [0,1], i — 1,2. In our codes, all PSO 
equations use xi and X2 and the corresponding velocity 
components. 

We choose ro,„ii„ = 0.94 sec, TQ^,nax = 37.48 sec, 
Ti.5,mm = 0.234 sec and Ti.5^,„ax = 1-021 sec. With tq 
and Ti.5 along the horizontal and vertical axes respec- 
tively, the upper right hand corner corresponds to binary 
component masses (in Mq) mi = 1.1 and m2 = 1.1. The 
lower left hand corner corresponds to mi = 10.5 and 
7712 = 9.7. 



D. PSO design variables 

One of the questions posed at the beginning was about 
the number of design variables in PSO. In our implemen- 
tation, there are a total of 9 that are listed below for 
reference. 

Np] Number of particles in the swarm. 

Ci, C2'. Acceleration constants. 

Vmax ■ Maximum velocity of a particle. 

a, Nt'. The parameters used to specify the termination 
criterion for PSO. These parameters are not part 
of the standard PSO algorithm. 

Parameters governing the inertia decay law: The iner- 
tia weight is decreased in value as PSO progresses 
through a search. The PSO literature is full of 
different types of decay laws but, in general, it 
is known that a strictly linear decay law is not 
very useful. We have developed the following de- 
cay law that has elements of both linearity and non- 
linearity. Let w[k] be the value of the inertia weight 
at step fc, 

w[k] =wo- m{k - ko)/Nt , (16) 

where wq > and m > 0. The parameter feg starts 
with an initial value of fco = and is kept fixed 
as long as gbest stays within the current region of 
convergence. If gbest exits the convergence region 
at some step k' without termination, /cq is set equal 
to k'. Thus, the value of the inertia is reset to the 
starting value of wq every time termination fails 
and the linear decay of the inertia starts anew. 



Nrep- For given data x, independent runs of PSO yield 
different values of A(6'|x) corresponding to the dif- 
ferent fitness values at termination. This is un- 
avoidable for any stochastic method. However, ter- 
mination near the true global maximum in indepen- 
dent runs of PSO on the same data should result 
in the clustering of the different values found and 
their locations. We can turn this argument around 
by running PSO independently several times on the 
same data and using the formation of a cluster as an 
indicator of successful termination in the vicinity of 
the global maximum. The number of independent 
runs of PSO on the same data, Nrep, is also a design 
variable. 



IV. TUNING THE DESIGN VARIABLES 

For any stochastic method, convergence to the global 
maximum can only be quantified as a probability. In 
some asymptotic limit, such as particle number Np oo 
for PSO, this probability becomes unity. However, this 
also implies an infinitely large computing cost. Thus the 
design variables must be tuned to find the best trade-off 
between the probability of convergence and the associ- 
ated computational cost. We present here the procedure 
followed for tuning the design variables of PSO. 

In contrast to the tuning procedure used for most 
MCMC methods in the GW literature, our approach is 
not based on data containing a signal but data that is 
purely noise. The latter is the worst case scenario for 
any stochastic method. However, good performance in 
the pure noise case more or less guarantees success when 
a signal is present. Moreover, this approach to tuning 
avoids any bias due to the use of a particular set of sig- 
nals or SNRs. 

The tuning procedure presented here can be used, in 
principle, to tune all the nine design variables of PSO 
(c.f.. Sec. IHID)) . However, applying the procedure to all 
of them is computationally too expensive, at least for 
the objectives of this paper. We focus instead on two 
of the most important variables for the performance of 
PSO, Np and Nf. For the rest, we either choose values 
commonly used in the literature or simply pick reasonable 
ones based on our experience with PSO. Thus, we set : 
ci = C2 = 2, V^ax = (0.5,0.5), wo = 0.9, 777 = 0.4, 
a = 0.03 and Nren = 5. 



A. Criterion for optimal tuning 

Measuring the probability of convergence for the pure 
noise case presents a practical problem. In simulations 
where a large SNR signal is present, we know that the 
true maximum is most likely to be in close proximity to 
the location of the signal and it can be found reliably 
using, say, a small area grid-based search. For the pure 
noise case, however, the location is not known a priori, 
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even approximately, and the only reliable solution is a 
grid-based search over the entire search region. However, 
we avoid this solution because (i) the simulations become 
computationally very expensive, and more importantly, 
(ii) it would fail for higher dimensional problems where 
grid-based searches are infeasible. 

To circumvent this problem, we invoke the argument 
outlined in Sec. lIIIDl for using N^-ep wherein termination 
in the vicinity of the global maximum is indicated by the 
clustering of the fitness and parameter values over inde- 
pendent runs of PSO. One way to further confirm the 
association between a cluster and the global maximum is 
to increase the number of particles significantly and ver- 
ify that a cluster forms around the same location. This 
is similar to what is done, for example, in the numerical 
solution of differential equations. To check that a given 
solution is valid, the computational grid is made denser 
and the new solution is compared with the old one. The 
above ideas can be quantified as follows, allowing an ob- 
jective criterion for tuning to be developed. 

Let there be a number of independent trials, in each of 
which a new realization n of noise is obtained and PSO 
is run N^ep times on n. Thus, in each trial, N^ep values 
are obtained for each of the chirp-times tq and ri.5, and 
the corresponding fitness values \{9\n). We define a set 
of Nrep = 5 numbers to be clustered if at least 3 of them 
lie in a range that is less than 30% of the entire range 
of the 5 numbers. This definition of clustering is applied 
to each of the three sets of Nrep values above. We then 
define. 

Probability of clustering: Let Ptq , 5 and P\ be the 
fraction of trials in which clustering occurs for tq, 
ri.5 and \{9\n) respectively. The maximum among 
Pr„, Pti ^ and P\ is defined as the probability of 
clustering. 

Consistency of clustering: If, for a given realization of 
noise, the N^ep fitness values are found to be clus- 
tered, then the cluster is defined to be consistent 
if (i) the fitness values are also clustered for N'^ 
sufficiently greater than Np, and (ii) the absolute 
difference between the maximum fitness values p 
and p' , corresponding to Np and N'p respectively, 
is < 10% of their mean, [p + /o')/2- We define the 
consistency of clustering as the fraction of trials in 
which the clusters are consistent. 

We deem a given combination of design variable values 
acceptable if both the probability and the consistency of 
clustering exceed 0.9 for that combination. Of all the 
combinations that are acceptable, the optimal is chosen 
to be the one that has the lowest computational cost in 
terms of the mean number of template evaluations. 

B. Simulations 

The tuning procedure described above is now applied 
to the two design variables Np and Nt . The following set 



TABLE I: Computational cost of PSO on data with no signals. 
For each combination of Np and Nt, the mean number of 
fitness function evaluations is listed along with the maximum 
(superscript) and minimum (subscript) over 50 trials. The 
mean values have been rounded off to the nearest integers. 





Np = 42 


81 


121 


Nt = 20 
40 
80 
120 
160 


8309i^g** 

1740li^ir 
28920ilgf 
38567iMig 
48147iiiEg 


31694«ii^ 
52669ifiii 
69982t|l?g 
86759jgglf 


25006?igg 
446326ii;5 

74115iffi 
101495^111° 
126346jiJfg5 



of points is used to find the acceptable combinations: 

Np e {42,81,121} 

Nt e {20,40,80,120,160} 

This particular domain in the Np-Nt plane is chosen 
based on our empirical experience with PSO. (Np — 42 
corresponds to a 7-by-6 grid of initial positions, 81 to a 
9-by-9 and 121 to an 11-by-ll one.) The number of trials 
is 50 and each realization of noise is 64 seconds long with 
sampling interval Ss = 1/2048 sec. 

The tuning procedure proceeds as follows. 

a. Computational cost - For each point in the Np- 
Nt plane, we record the mean number of fitness function 
evaluations. The results are shown in Table HI 

b. Probability of clustering - Table |TT] lists the prob- 
ability of clustering for each combination of Nt and Np. 
Note that for the combination Np = 121 and Nt = 40, 
Pro — 94% is very different from P\ — 76% and P^^ ~ 
80%. This suggests that the abnormally high value of 
Prg here is most likely a statistical outlier. Therefore, we 
do not consider this combination as having a probability 
of clustering > 90%. 

c. Consistency of clustering - Referring to Table [Til 
we see that the consistency test is required only for 
Np > 81 and Nt > 80 for which, as per the definition 
of acceptability above, the probability of clustering is 
> 90%. Further, for a given Nt, the computational cost 
is lower for Np — 81 than Np = 121. Hence, we only tune 
over Nt > 80 for Np = 81. No extra work is required for 
obtaining these results since for each trial, the same data 
realization was used for both Np = 81 and Np = 121 and 
the latter can be used to check if a cluster found by the 
former was consistent or not. In other words, Np = 81 
and Np — 121 in the definition of the consistency of clus- 
tering given earlier. 

We obtain the following results for the consistency of 
clustering: 91%, 93% and 95% for Nt = 80, 120 and 160 
respectively. Thus, according to our final criterion, we 
pick Np = 81 and Nt = 80 as this is the acceptable com- 
bination with the lowest computing cost (c.f., Table U). 
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TABLE II: Probability of clustering for different combinations 
of Np and Nt- For each combination, the fraction of trials 
(in %) Pa, Ptq and Pti 5 for which the fitness, tq and ri.5 
values respectively were found to be clustered are listed. The 
probability of clustering, shown in bold, is the maximum over 
Pa, Ptq and 5. The number of trials for each combination 
is 50. 





Np = 42 


81 


121 






60 


70 


Nt = 20 


CP^ )74 


72 


82 




(P^ 168 


72 


82 




72 


76 


76 


40 


82 


88 


94 




86 


76 


80 




84 


84 


90 


80 


84 


90 


92 




88 


86 


92 




72 


88 


96 


120 


78 


92 


92 




68 


88 


96 




82 


86 


94 


160 


88 


86 


94 




78 


80 


92 



TABLE III: Statistical difference in the distribution of max- 
imum fitness values. The table entries are the significance 
values from a two sample Kolmogorov-Smirnov test with the 
null hypothesis: the ma^ximum fitness values from trials that 
show clustering of all parameters and trials that do not are 
drawn from the same parent distribution. The numbers in 
parentheses are the significance values for the test done with 
minimum fitness values. 





Np = 81 


121 


Nt = 80 


0.5(2 X 10-2) 


0.9(0.7) 


120 


0.9(2 X 10-2) 


0.4(0.4) 


160 


0.7(8 X 10-*) 


0.4(0.9) 



clustering, the PSO run that yields the maximum fitness 
value terminates, with high probability, in the vicinity 
of the global maximum for the Np — 81 and Nt — 80 
combination. 

We have traced the lack of clustering to the presence 
of distant peaks in the fitness function that are similar in 
value. The probability of this happening in the presence 
of a sufficiently strong signal is very small, but this need 
not be so for noise-only data. 



C. Trials with no clustering 



So far, we have focussed on clustering as the main in- 
dicator of success in locating the true global maximum. 
Does this imply that in the trials in which there is no 
clustering, PSO fails to locate the global maximum? To 
address this, we carried out the following test. First, we 
retain the maximum among the Nrep fitness values from 
each trial. For each combination of Np and Nt, we divide 
the set of maximum fitness values into two disjoint sub- 
sets: one in which all parameters, the two chirp-times and 
the fitness, were clustered and the other in which at least 
one parameter did not show clustering. For the former 
set, clustering of all three parameters is a strong indica- 
tor of successful termination near the global maximum. 
A two-sample Kolmogorov-Smirnov (KS) test ^9|] is car- 
ried out to see if the two subsets were drawn from the 
same parent distribution. The results are summarized in 
Table inil 

As can be seen from the table, in all cases the test 
supports the hypothesis that the maximum fitness value 
is drawn from the same distribution irrespective of the 
clustering of the parameters. That this is a non-trivial 
result is further supported by the fact that if the same 
test is done with fitness values other than the maximum 
one, the null hypothesis is rejected strongly. Table IIIII 
shows the results from the same test but using the set of 
minimum fitness values. In this case, it is seen that the 
values are drawn from different distributions, at least for 
Np = 81. Thus, we conclude that even in the absence of 



D. Comments 

We have demonstrated a systematic tuning procedure 
for the design variables of PSO. It is important to note 
that no prior information about any special features of 
the fitness function was used. Hence, the procedure 
would stay the same if the testbed were changed. 

A larger number of trials or a finer spacing of grid 
points in Np and Nt will probably lead to a different end 
result. Instead of Np = 81, for example, Np — 121 may 
turn out to be the right choice. However, the main goal 
in this paper is to test the viability of PSO and, for this 
purpose, a coarse tuning such as the one presented here 
is adequate. Besides, a significant investment in refining 
the results of the tuning procedure would be rendered ob- 
solete with future improvements in the implementation 
of PSO. Until a version of PSO is developed that is hard 
to improve upon, a strong focus on the results from tun- 
ing, as opposed to improvements to the tuning procedure 
itself, is not of much use. 

For Np ^ 121, Table Hill shows that the minimum fit- 
ness values obtained with and without clustering are also 
mutually consistent. This observation suggests an alter- 
native approach to tuning where one of the measures used 
for picking the optimal combination is this type of con- 
sistency. We leave this for future work to address. 
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FIG. 2: The region in the mi, m2 plane corresponding to 
the physically valid part of the tq, ti.s plane. The region is 
indicated by taking a regular grid of points in the to, ti.s 
plane and mapping them to the corresponding values of mi, 
m2, where by convention mi > m2. The ★ markers shows the 
signal locations used in the simulations. 



V. RESULTS WITH SIGNAL PRESENT 

In this section, we describe the results of simulations 
performed with signals added to data. We quantify the 
performance of PSO at four different values of signal SNR 
and four different locations in the tq, ri.5 plane, 

SNR e {9.0,8.0,7.0,6.0}, 
(rcTi.s) e {(5.0,0.6), (10.0,0.75), 

(16.0,0.762), (20.0,0.9)} . 

(The units for both tq and ri.5 are in seconds). 
The corresponding masses (in Mq) of the binary 
components are, respectively, {(mi — 7.78, m2 = 
1.91), (4.71, 1.35), (2.40, 1.40), (2.61, 1.03)}. Fig. Ushows 
the physical part of the search region mapped into the 
mi, TO2 plane along with the signal locations. 

For each combination of signal location and SNR, 50 
independent data realizations are generated. The length 
of each realization is 64 sec, with 6s — 1/2048 sec, and 
the signal is added at an offset of 10 sec from the start. 



A. Qualitative changes induced by a signal 

It is instructive to observe how a signal affects the be- 
havior of the swarm. In general, the presence of the signal 
leads to a broadening of the peak in the fitness function. 
As is well known, the broadening is more pronounced in 
one direction, due to the correlation between estimation 
errors, leading to the appearance of a thin ridge-like fea- 
ture (c.f., FigP. 



000 




FIG. 3: Evolution of a swarm in the presence of a signal. 
The 'o' and '*' markers show the pbest locations of A'p — 81 
particles when 5% and 60% of the total number of steps were 
completed respectively. The lines show the paths followed by 
the pbest locations of 5 representative particles between these 
two steps. With time, the pbest locations tend to congregate 
around the ridge-like feature produced by a signal. 

The particles begin by moving randomly in the param- 
eter space but each time a particle crosses the ridge, its 
pbest tends to fall closer to the flanks of the ridge. As 
time progresses, the pbest of all particles cluster around 
the ridge. This increases its attractive power in the accel- 
eration of the particles, progressively drawing more par- 
ticles into exploration of the fitness function along the 
ridge. 

Fig.Oshows snapshots of PSO at different stages in the 
search and the progressive clustering of pbest locations is 
seen clearly. A key point to note here is that no prior 
knowledge is built into PSO about the ridge-like feature. 
It is found by the particles as they explore the search 
region. 



B. Figures of merit 

In order to quantify the performance of PSO in the 
presence of a signal, we look at two figures of merit. The 
first is the probability of clustering defined in Sec. IIV Al 
Since the tuning procedure requires a minimum value of 
90%, the probability of clustering in the presence of a 
strong signal should be significantly higher but it should 
be consistent with the pure noise case for weak signals. 

When a signal is added to the data, we do not need the 
consistency of clustering criterion of Sec. IIV Al in order to 
confirm the association of a cluster with the global maxi- 
mum. Since we know the location of the signal and since 
the expectation of the fitness function must be maximum 
at that location, it suffices to check if the maximum fit- 
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TABLE IV: Probability of clustering for simulations with sig- 
nal present in the data. For each combination of signal SNR 
and location, the fraction of trials (in %) Pa, Ptq and Pn 5 for 
which the fitness, tq and ri.5 values respectively were found to 
be clustered are listed. The probability of clustering, shown 
in bold, is the maximum over Px, Ptq and P^ 5 . The number 
of trials for each combination is 50. 





\i Of ' 1.0 / 

= (5.0, 0.6) 


(10.0,0.75) 


(16.0,0.762) 


(20.0, 0.9) 




(P\]98 


94 


94 


90 


SNR=9.0 


(^^^0)94 


92 


96 


98 




(P.1.J94 


94 


94 


94 




96 


98 


98 


94 


8.0 


96 


96 


92 


92 




92 


96 


92 


96 




96 


92 


98 


92 


7.0 


96 
90 


90 

88 


94 
98 


88 
90 




92 


82 


88 


94 


6.0 


94 


86 


86 


96 




84 


86 


84 


96 



11 r 

10 - 

O Q - 



.r.-.y" 



6 7 8 

Fitness at signal location 



FIG. 4: Scatterplot of maximum fitness value found by PSO 
(Y axis) against the value at the known signal location, tq = 
10.0 sec, ri.5 = 0.75 sec, for all signal SNR values. In one 
trial, the maximum fitness value (near 6.0 on the Y axis) dips 
below the line of equality (dashed). 



C. Signal detection and parameter estimation 



ness in the cluster is larger than the value at the signal 
location. Our second figure of merit, therefore, is the 
fraction of trials in which this occurs. Ideally, this figure 
of merit should be unity. 

Table HVl reports the first figure of merit for each com- 
bination of signal SNR and location. As expected, for the 
case of strong signals (SNR> 7) the probability of clus- 
tering is always, and often significantly, higher than 90%. 
For the weak signal SNR of 6.0, the probability of clus- 
tering has an average value of 91% which is statistically 
consistent with the pure noise case of 90%. 

As far as the second figure of merit is concerned, 
we find that it is unity for all combinations of signal 
SNR and locations except for one, namely, SNR = 8.0, 
To — 10.0 sec and ri.5 — 0.75 sec, for which it was 0.98. 
Fig. 2] shows the scatterplot between the maximum fit- 
ness found by PSO and the value at this signal location 
for all signal SNR values. It is seen that in one trial the 
maximum fitness fell below the value at the signal loca- 
tion. However, the two values are so close that the figure 
of merit should be considered to be practically unity for 
this case too. 

Taken together, the figures of merit show that PSO 
almost always terminates near the true global maximum 
when a sufficiently strong signal is present. When the 
signal is weak, we recover the performance ensured by 
the tuning procedure for the pure noise case. 



In order to cast the results obtained so far in terms of 
signal detection and parameter estimation performance, 
we choose the maximum fitness value found over the Nrep 
runs as the detection statistic and the location corre- 
sponding to the maximum fitness value as the estimator 
for the chirp time parameters. 

1. Detection 

It was discussed in Sec IIV CI that, after tuning PSO, 
the distribution of the detection statistic in trials with 
and without clustering remains the same. As the simul- 
taneous clustering of the two chirp-times and the fitness 
values indicates termination in the vicinity of the global 
maximum, it follows that the probability distribution of 
the PSO detection statistic is about the same as that of 
the global maximum. Strictly speaking, the PSO detec- 
tion statistic will always have a value less than the global 
maximum but, given our termination criterion, the rel- 
ative difference between the two is less than 3%. Thus, 
the false alarm probabilities, for a given detection thresh- 
old, corresponding to the PSO detection statistic and the 
true global maximum are also nearly the same, with the 
former being slightly smaller. 

In the presence of signals with an SNR of 8 or higher, 
which is the typical value sought in a real detection, it 
was shown that almost all trials exhibit clustering and 
that the detection statistic value was always higher than 
the fitness at the true signal location. Hence, the distri- 
bution of the detection statistic in the presence of a sig- 
nal also closely follows that of the true global maximum. 



11 



Strictly speaking, as with the false alarm probability, the 
detection probability will be slightly smaller for the PSO 
detection statistic, for a given threshold, as compared to 
that for the true global maximum. 

The above line of reasoning suggests that the Receiver 
Operating Characteristics (ROC) of the PSO detection 
statistic should nearly be the same as that of the true 
global maximum. The only way to rigorously verify this 
is to carry out simulations with a large number of tri- 
als in which both PSO and grid-based searches are per- 
formed. This is a computationally expensive task which 
we plan to undertake in the future. However, it is im- 
portant to note that such a comparison may not be pos- 
sible for searches that are too expensive for a grid-based 
search. 



2. Estimation 
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Table fVl summarizes the errors in the estimation of the 
chirp time parameters for signal SNR values > 8 at the 
different signal locations used in the simulations. Each 
entry in the table is an estimate of the root mean-square 
error (rmse) defined as, 



rmse(0) 



E 



(17) 



where 9 is the estimator of 6. The rmse includes the 
effects of both estimator variance and bias. 

Since the search region in the current testbed includes 
unphysical chirp time parameters, the global maximum 
and, hence, the estimated chirp times fall there in some 
trials. Fig. [S] shows an example where the estimates from 
all the trials are shown for a signal SNR of 8. In an 
improved implementation of PSO, blocking the unphysi- 
cal region should improve parameter estimation accuracy 
significantly. As an indicator of this, we also show in Ta- 
ble |Vl the rmse obtained by dropping the trials where the 
estimate fell in the unphysical region. It is seen that the 
errors are reduced significantly, especially for the lower 
signal SNR value for which there is more scatter into the 
the unphysical region. 

A comparison of Table |V] with existing results j20j 
shows that the estimation errors due to PSO are con- 
sistent with grid-based searches. 



D. Computational cost 

The number of fitness function evaluations for each 
combination of signal SNR and location are shown in 
Table. EH It is seen that for a signal SNR of 9.0, the 
maximum number of evaluations is about the same as the 
mean in the pure noise case (c.f., Table|l|. This reduction 
is consistent with the fact that a strong signal makes it 
easier for the swarm to find the global maximum. 



FIG. 5: Estimation of parameter values for a signal SNR of 
8.0. The true locations of the signals are indicated by the ★ 
marker and each of the markers, •, +, * and x, indicates an 
estimated location corresponding to one of the true locations. 
The association between the markers and the true signal loca- 
tions is indicated in the figure. For each true signal location, 
the simulation consisted of 50 trials. 



TABLE V: Signal parameter estimation errors with PSO. 
Each entry in the table is of the form a{b), where a and b are 
the estimated root mean square errors (rmse) for ro and ri.s 
respectively (expressed as a percentage of the true parameter 
value). In each row, the top and bottom pairs of numbers refer 
to mse obtained without and with the physical boundary cut 
respectively. All the numbers have been rounded off, given the 
expected precision from the 50 trials used per combination of 
signal SNR and location. 





(T"o,-ri.5) 
= (5.0,0.6) 


(10.0,0.75) 


(16.2,0.762) 


(20.0, 0.9) 


SNR=9.0 


2(13) 
2(12) 


1(11) 
0.5(6) 


0.3(6.0) 
0.2(4) 


1(11) 
0.2(4) 


8.0 


44(13) 


32(16) 


12(16) 


11(17) 


10.5(10) 


1(10) 


0.3(5) 


12(10) 



TABLE VI: Computational cost of PSO on data containing a 
signal. For each combination of signal SNR and location, the 
mean number of fitness function evaluations, over 50 trials, is 
listed along with the maximum (superscript) and minimum 
(subscript). All numbers are in units of lO** and rounded off 
to a single digit of precision. 



(to, n.s) 
(5.0,0.6) 



(10.0,0.75) (16.0,0.762) (20.0,0.9) 



SNR=9.0 


4.4i:i 


4.7!:I 


4.8t:l 


4.7l:i 


8.0 


4-03.8 


4.7i:f 


4.8t:f 


4.8t:! 


7.0 


4.7^ 


4.8i:^ 


4.9^:? 


4.8i| 


6.0 


4.5^i 


4.8l:E! 


4.7i:i 


4.7^;^ 
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For ground-based detectors, the dominant computa- 
tional cost comes from the pure noise case. Ahhough our 
tuning procedure produced iVp = 81 and Nt = 80 as the 
optimal combination, there is statistical uncertainty in 
this result due to the finite and somewhat small number 
of trials used. To make our estimate of the computational 
cost conservative, we use the combination Np = 121 and 
Nt = 80 instead for which all the performance measures 
are significantly better. From Table HI the typical number 
of fitness evaluations required for the testbed considered 
here is ~ 7 x 10'' with a spread of about ±2 x 10^. Of 
this, the termination criterion itself accounts for a fixed 
number, {Nt = 80) x {Np = 121) = 9680, of evaluations. 

A grid-based search provides a convenient perspective 
for evaluating the computational cost of PSO. According 
to Q, for 2PN waveform and initial LIGO noise PSD, 
the number of fitness function evaluations required in a 
single grid with a minimal match of 0.97 (=> a — 0.03) 
is 1.1 X 10^ if the minimum mass used for constructing 
the template waveforms is lAf©. In the current testbed, 
the search region in the massplane (c.f.. Fig. ^ is not 
the simple one considered in [5| although the minimum 
masses are similar. Additionally, Q uses an analytic fit 
for the noise PSD that differs from the one used here. 
Ignoring these differences we find that the current imple- 
mentation of PSO requires about 7 times as many eval- 
uations, on the average, as a grid-based method. 



VI. CONCLUSIONS 

We applied PSO to the binary inspiral testbed where 
the main challenge was to locate the global maximum 
of a highly multi-modal fitness function. Such functions, 
with an unpredictable number of extrema having random 
locations and sizes, are typical in GW data analysis. 

In response to the questions posed at the beginning, 
the results obtained from simulations show clearly that: 

1. PSO is a viable method for signal detection and 
estimation in GW data analysis as it can success- 
fully handle the challenge of high multi-modality 
presented by such problems. 

2. Good performance was achieved by tuning only two 
out of the nine design variables involved in the 
method. Thus, PSO is a stochastic method that 
offers the possibility of having a small number of 
design variables in practice. 

3. The design variables were tuned using a systematic 
procedure that does not require any prior informa- 
tion about features of the fitness function. As such, 
the procedure should be widely applicable to other 
stochastic methods also. 

4. PSO is about 7 times more expensive than a grid- 
based search in the number of fitness function eval- 
uations required. 



The higher cost of PSO is not surprising since grid-based 
searches are usually more efficient than stochastic meth- 
ods in low-dimensional problems such as the one consid- 
ered here. The performance gain of stochastic methods 
appears due to the slower rise in their computational cost, 
with increase in dimensionality, compared to the expo- 
nential one of grid-based searches. Therefore, we expect 
PSO to be cheaper than grid-based searches in higher 
dimensional problems. However, a definitive answer re- 
quires an actual test on problems such as the inspiral 
of high mass spinning binary components or the LISA 
Galactic Binary resolution problem 2l[ . The demonstra- 



tion in this paper that PSO can handle the more serious 
challenge of high multi-modality is the first step towards 
such future investigations. 

The computational cost of PSO may be significantly 
reduced by taking into account the physical boundary in 
parameter space (see Fig. [5]). The current implementa- 
tion of PSO requires the search to extend over a large 
unphysical region. In fact, as far as the binary inspiral 
problem goes, we find this to be the most outstanding is- 
sue. We have tried the invisible walls condition with the 
curved physical boundary but find that the performance 
of PSO is negatively affected. Specifically, termination 
takes a much longer time and the probability of cluster- 
ing is significantly reduced. This behavior is attributable 
to the curved shape of the boundary allowing a signif- 
icantly larger number of particles to escape the search 
domain. Once outside, particles contribute nothing to 
the search and keep moving until they are pulled back. 

To solve this problem, it appears inevitable that the 
dynamical equations of PSO must be modified. For sig- 
nals other than binary inspirals, such as Galactic Binaries 
in the case of LISA, the nature of the boundary problem 
would be different and it may not be an issue in some 
applications. 

Finally, a comment about the use of Gaussian, station- 
ary noise in the testbed. We emphasize here that PSO 
is a method for finding the global maximum of a fitness 
function irrespective of what produces that peak, a gen- 
uine GW signal or an instrumental transient. Since, the 
implementation of PSO in this paper uses no prior infor- 
mation about features of the fitness function, it should 
find the peak regardless of its source. Thus, there should 
be no significant difference in the performance of PSO 
and a grid-based method even for non-stationary, non- 
Gaussian noise. In future work, we will verify this ex- 
plicitly by using non-GW signals in our simulations. 
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Appendix A: Details of the signal waveform 

1. Chirp-time parameters 

The chirp-time parameters, {ra}, a = 0,1,1.5,2, are 
given in terms of the masses, mi and 7712 < mi, of the 
binary components (we use c — G = 1), 



where Al — mi TO2 is the total mass of the compact 
binary, fi — mim2/M is the reduced mass and -q — ^jl/M. 

Since all the chirp time parameters depend on mi and 
TO2, only two of them are independent. It is convenient 
to choose To and ri.5 as the independent parameters since 
M and /i can be obtained algebraically from them. 
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allowing ti and T2 to be obtained algebraically from tq 
and ri .5. 
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2. The phase function 



In Eq. [HI the function ?/'(/; 9) is given by, 
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The functions Ua, a = 0, 1, 1.5, 2 and the /"''^^ factor in 
the amplitude of the signal (Eq. [S]) can be pre-computed 
and stored, reducing the computational cost of generat- 
ing waveforms. 
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